On the Performance of Transparent MPI Piggyback Messages
Abstract
Many tools, including performance analysis tools, tracing libraries and application level checkpointers, add piggyback data to messages. However, transparently implementing this functionality on top of MPI is not trivial and can severely reduce application performance. We study three transparent piggyback implementations on multiple production platforms and demonstrate that all are inefficient for some application scenarios. Overall, our results show that efficient piggyback support requires mechanisms within the MPI implementation and, thus, the interface should be extended to support them.

1 Motivation

Tools and support layers often must send additional information along with every message initiated by the main application. In most cases, tools must uniquely associate this additional information, often called piggyback data, with a specific message in order to capture the correct context and to avoid additional communication paths.

A wide range of software systems piggyback data onto messages for diverse purposes; we detail a few here. Tracing libraries correlate send and receive events by sending vector clock information [4]. Performance analysis tools attach timing information to detect and analyze critical paths [2] or to compensate for instrumentation perturbation [7]. Application level checkpoint layers transmit epoch identifiers to synchronize global checkpoints [5].

Unfortunately, the MPI standard does not include a transparent piggyback mechanism. Instead, each system must provide its own, often ad-hoc, implementation. While a generic piggyback service could be added to an infrastructure like PMPI [6], the optimal solution depends on the specific usage scenario.
In this paper, we study the overhead and tradeoffs of three methods to support piggyback data:
– manual packing and unpacking of the piggyback data and application payload into the same buffer;
– using datatypes with absolute addresses to attach piggyback data to the application payload; and

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 (LLNL-CONF-402937).
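The first method listed above, manual packing, can be illustrated with a minimal sketch. This is not the paper's implementation: the `piggyback_t` record and the `pb_pack` and `pb_unpack` helpers are hypothetical names introduced here, and the sketch shows only the buffer handling, with no MPI calls. It assumes a transparent wrapper would intercept the application's send, build the combined buffer, and pass it to the underlying send routine, with the receiver splitting it again.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical piggyback record; the single field is illustrative only. */
typedef struct {
    uint64_t epoch;
} piggyback_t;

/* Copy the piggyback record and the application payload into one newly
 * allocated buffer, piggyback first. Returns NULL if allocation fails.
 * The caller owns the buffer and must free() it. */
unsigned char *pb_pack(const piggyback_t *pb, const void *payload,
                       size_t payload_len, size_t *out_len)
{
    size_t total = sizeof *pb + payload_len;
    unsigned char *buf = malloc(total);
    if (buf == NULL) {
        return NULL;
    }
    memcpy(buf, pb, sizeof *pb);
    if (payload_len > 0) {
        memcpy(buf + sizeof *pb, payload, payload_len);
    }
    *out_len = total;
    return buf;
}

/* Split a combined buffer back into piggyback record and payload.
 * Returns 0 on success, or -1 if the buffer is too short to hold a
 * piggyback record or the payload does not fit in payload_cap bytes. */
int pb_unpack(const unsigned char *buf, size_t buf_len, piggyback_t *pb,
              void *payload, size_t payload_cap, size_t *payload_len)
{
    if (buf_len < sizeof *pb) {
        return -1;
    }
    size_t n = buf_len - sizeof *pb;
    if (n > payload_cap) {
        return -1;
    }
    memcpy(pb, buf, sizeof *pb);
    if (n > 0) {
        memcpy(payload, buf + sizeof *pb, n);
    }
    *payload_len = n;
    return 0;
}
```

Each message in this sketch costs one extra allocation and one extra copy of the whole payload on each side, which is one plausible source of the overhead the study measures. A real wrapper would also have to handle non-contiguous datatypes and nonblocking operations, which this sketch ignores.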
Publication date: 2008